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Abstract 

If a linear program (LP) possesses a large generalized network (GN) subma- 
trix, this structure can be exploited to decrease solution time. The problems of 
finding maximum sets of GN constraints arrJ finding maximum embedded G!\ 
submatricps are shown to be NP-com{)Icdp iridicalmg tha( reliable, c'fTic'ierit sr)lu- 
tion of these problems is difficult. Therefore, efficient heuristic algorithms arc 
developed for identifying such stnrdure and arc tc‘slcd on a sclnchon of 
twenty-three real-world problems. The best of four algorithms for irt •nlifying 
constraint sets finds a set whic'h is maximum in twelve cases and a.vf'ragr^s 99.1 T 
of maximum. On average, the GN corislroinls idenlified comprise more thin 
62.3% of the total constraints in these problems. The algorithm for identifying 
embedded ON submatriccs finds submatrices whose sizes, rows plus columns, 
average 96.8% of an LP upper bound. Over 91.3% of the total constraint matrix 
was identified as a GN submatrix in these problems, on average. 

"The act of being wise is the act of knowing what to overlook." 

William James (ca. 1890) 



1. Jiilrodactiorj 

Large-scale linear programming (LP) models frequently have sparse 
cccfficicnt matrices with special structure. If special structure can be 
identified, it con often be exploited to reduce the cost of sohdng the LP. "Direct 
factorization," c.g. [13], maintains a partitioning of the rows and/or columns of 
all simplex bases. Computations are reduced with respect lo standard method? 
if special structure can be isolated within the p«.\rtitions. "Decomposition," e.g. 
[14], splits a problem into a master problem and one or more subprobhuTis. This 
technique is most efficient when subproblems consist cniir(4y of special struc- 
ture allowing their rapid solution. The details of these exploitation schemes will 
not be discussed here. 
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Useful structures found rrnbc'dded in a subset of the rows and/or columns 
of an l.F constraint matrix iru iudc simple upper bounds (at most one nonzero 
element in each row), generatized upper bounds (GUH) (at most one nonzero 
coefTioient in each column), and networks (at most two nonzero elements in 
Cvu:h colunin). Varieties of cmbeddc'd networks iru:lude the general case, gem 
cralizc.d networks (GN), generalized transshipment networks (GT) (at most one 
coefTicicnl not. equal to +1), and networks (NET) (at most one +1 and one -1 
in each column). 

Simple' upper bounds. GUF3 and NET structures Imvc bemn exploited in vari- 
ous comniei'ciai and experimental optimization S}steiirv tuid ufTieieiit aiitomatie 
identifieat ion schc'incs have been dc'veloped to find llu'se structures, e.g., 
[4,7.0], 

Recent research has produced very cfTicicnt specializc'd simplc'x algonthms 
for solving ru'twork probk'ms (Tor example, sec [3] for NET, |(>) for GN, and 
[ 6 , ll] for Cr.). This rc’sear c'h 1 ms, in turn, been c'Xj)loited to dc'v clof^ faeton/e'd 
optimization s>'sterns v. hicdi solve geiKuml LP problems with a si'l of rov. s ('xhibil- 
ing NET structure [13], GN structure [IBJ, and GT structure' [12,19]. Evi'ii more' 
recently, optimization systems hav^e been tested which use direct, factorization 
[19] or primal and/or dual decomposition [1-1] to c'\[)lod ('inbeddc'd GN strue- 
turn'. 

Now^ that software is available to schx' G\ (and GT) pi’oblerns [t)|, it is verv 
Lkely that sc'veral research groups will exploit GN m vanous we.\'s in the lu at' 
future. To support thi^ research, we are interested in cfTicic'iilly rev] automati- 
cally idenlib ing G\ siruc'ture of the following \-aric'ties in gt ru'ial bP ('oefTic'ient 
m.itric^'s: 

G\c .A subsc't of liP ccdumri"^ whic h arc' GN. or 

GN[^ A subset of hP r ows whic'h arc GN, or 

GNrc An c'mbeiidc'fl G\ williin a. sut.)'<^'t of the' rows and ('oluniris of 1,[\ 

Rceau'^e tlie cfTieicrr'y of solving a gg'iu'ral LP with G N-c'XfNioil ing mc'lhods is 
enh incc'd if ttic' GN structure' is Lngc', Triaxinium GN slruc'lures ar c' cur goal. 
Th.' k ids to I lie maximizat ion probk'ms describc’d below. 

Led bc' the.' mxn eoi'tTicic'rit matrix of liP, and ket be the 

a^ -oeialed 0-1 mc'idc'rice matrix for A. Ttu' three maximization probk'iiis, fornun 
latc'd a^ intc'gf'r programs, are 

d/(G\d ): m IX V 

^ ) 

s t V Cj < 2 for all j 



wb.L'rc Cj is a bmiiry decision variable ipallc^lting inclusion of column j in GN/; 
.'j'(GNk): max V r. 

^ I 
X 

r, e S0.1], 

where is a binary decision v.irinble indicnling incluraon of rov.' i in CNj,^; and 
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A/(GNr.c): max 2 n + ^ ^3 

tC ,1^ ^ j 

2 ^j'^i '^jCj < 2 +mj for all j 

i 

n, Cj e |0.1{. 

where and Cj are binary decision variables indicating respective inclusion of 
row i and column j in GNr c. where rrij = 2 ^^te that our definitions 

i 

of maximum GN factorizations are expressed simply as the sum of the rows 
and/or columns included. 

Much work has been done on the development of algorithms to identify spe- 
cial substructures in LPs. Previous work in identifying GUB subsets of con- 
straints is well known [4.7]. Brown and Wright [8] have explored ways to identify 
NET subsets. Extraction of hidden NET structure with general linear transforma- 
tions has been discussed by Bixby and Cunningham [2] and by Musalem [20]. 
Identification of GN row sets and other structures has been proposed by Sehrage 
[ 21 ], 

The problems of identifying maximum GUB and NET constraint subsets are 
NP-eomplete, and consequently, exact solutions cannot be guaranteed to be 
obtained quickly. Since GUB and NET constraints eire special cases of GN con- 
straints, it is to be expected that exact solutions of the GN identification prob- 
lems will also be difficult to obtain. We show that the GN identification problems 
are, in fact, NP-complete, but also give effective and reliable heuristic algo- 
rithms for them. 

In section 2, the complexities of the three maximization problems are 
investigated. //(GNh) and Af(GNp,c) are shown to be difficult so, in section 3, 
efficient algorithms are developed for finding approximate solutions to these 
problems. Four specialized integer programming heuristics are described for 
identifying maximal GNp sets. Two of the algorithms are "addition" heuristics 
which begin with the empty GNr set and successively add rows while maintaining 
feasibility. The other two algorithms arc "deletion" heuristics which begin with 
an infeasible GNr set and successively delete rows until a feasible set is found. 
Algorithm GNRC for ^/(GNpc) takes as input the GNp set found by any one of the 
the GNr heuristics. Then, it successively adds rows which introduce the least 
amount of weighted infeasibility and drops those columns where an infeasibility 
results. In this way. a sequence of GNrc sets is produced and the maximum of 
these taken to be the heuristic solution to A/(GNhc)- After the algorithms are 
presented, computational experience is given in section 4. 

2. Comfdexily 

In this section we investigate the complexity of //(GNq), /k/(GNp), and 
//(GNr.c)- /^(GNc) is trivially solvable in polynomial time by choosing all columns 
with at most tw^o nonzero elements in them; consequently, its complexity will not 
be discussed further. The other two problems arc more interesting. 

Following standard practice, /^/(GNr) and A/(GNrc) t>e studied with 

respect to their associated decision problems: 

Z?(GNr): Does there exist a set of rows R in II such that, for positive integer 

k <m, 

I I ^ A: and ^ Rij ^ 2 for all j ? 

ici? 
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Z?(GNrc)* Does there exist a set of rows R and columns C in H such that, for 
positive integer A: <771+71. 

I /? I + I C I ^ A: and £ — 2 for all j e C ? 

iei? 

Of course, a polynomial algorithm for one of the above decision problems would 
imply a polynomial algorithm for the associated maximization problem using, 
say, a binary search on the values of k . 

We consider the complexity of Z^(GNrc) first. Yannakakis [24] investigated 
the problem of finding the least number of nodes which can be deleted from a 
bipartite graph such that the resulting induced subgraph has a particular pro- 
perty. Restated in terms of the decision problem, he gives the following 
theorem on 0-1 matrices as a corollary of his results on graphs. 

Theorem 1: Let Q be any class of 0-1 matrices which is closed under permuta- 
tion and deletion of rows and columns. Let H be an mxn 0-1 matrix, and let k 
be some positive integer, k < tti+ti. Then, finding an ttioXtiq submatrix IIq of 11 
such that Ho ^ Q and mo+no ^ k is polynomial if the matrices of Q have bounded 
rank and is NP-complete otherwise. 

It is assumed above that membership in Q can be determined in polynomial time 
for a matrix of bounded size (otherwise, NP-hardness v.^ould be implied). 

This theorem is impressive in that it handles the NP-completeness question 
for 0-1 matrices in a wholesale fashion. The NP-completeness of Z?(GNrc) follows 
as a simple corollary. 

Corollary 1: Z?(GNr c) is NP-eomplete. 

Proof: Let Q be the class of 0-1 matr ices with at most two Is in each column. Q 
is obviously closed under permutation and deletion of rows and columns; 
matrices of arbitreirily large rank can be found in Q, and membership in Q can 
be determined in polynomial time. Z^(GNrc) for the incidence matrix II i.s 
equivalent to searching for an 7rioXno submatrix Hq of H such that Uq^Q and 
mo+7io > k, Ttiercforc, by Theorem 1, /^(GNr c) is NP-complcte. “ 

A 0-1 matrix H is represented as a bipartite graph with nodes on one side of 
the bipartition corresponding to rows, nodes on the other side of the bipartition 
corresponding to columns, and an edge (ij) for each h^j =1. D{GNi{) 
corresponds to a node-deletion problem with deletions restricted to one side of 
the bipartition; Yannakakis's results do not directly apply since they pertain to 
node deletions on either side of the bipartitionL Therefore, we use a problem- 
specific proof to show that /^(GN^) is NP-complete. 

Lemma 1. £?(GNr) is NP-eom.plete. 

I^of: For ease of representation, Z^(GNf^) will be equivalently stated in matrix 

iiolalioii: 

i?(G\i^): Does there exist a binary 7?i -vector x such that lx> k and H^x < 2? 

V{G\2) obviously in NP. We show that it is NP-completc by a transformation 
from the "Exact Cover by 3-Sets" problem [15], as specialized by Garey and 
Johnson [ 10]. 

Z^(.X3C): Does there exist a binary p-vector y such that ly = q and Ny = 1 

where N is a 3gxp, 0-1 matrix with exactly three Is in each column 
and at most three Is in each row? 



Bartho.di [1] has addressed tlLs topic, but his results are incorrplete. P^or instance, 
without additional restrictions, Piis Theorem 2 v/ould imply that Z?(GNc) is NP-complete. 



-5- 



For each row i in N with only one 1 or two Is, augment N with one or two unit 
vector columns respectively. Since none of these columns could be included 
in an exact cover of size q, D{X3C) is equivalent to 

D{X3C): Does there exist a binary vector y' of length jd +Z such that ly' = q 

and (E,N)y' = 1 where E corresponds to I augmenting columns? 

By construction of D{X3C), no set of columns of cardinality less than q could 
ever cover all the rows exactly once let alone more than once. Thus, D(X3C) is 
equivalent to a "minimum cover problem" 

D{UC): Does there exist a binary vector y such that, ^y ^ q and (E,N)y' > 1 ? 

Let x=l-y. Since each row contains exactly three Is, D{UC) is equivalent to a 
"maximum uncover problem" 

Z^(MUC): Does there exist a binary vector x such that Ix^pi-L-q and 

(E,N)x< 2? 

Since all above transformations are of polynomial complexity, and since Z^(MUC) 
is an instance of D{GK\{), is NP-complcte. ■ 

3. Algorithms 

The complexity results of the preceding section indicate that sohing 
j^/(GNr) and ^/(CNk c) exactly could be very timc-consumins;. Therefore, heuris- 
tic algorithms have been developed for obtaining approximate solutions. W'c 
describe the algorithms for A/(GNr) first. 

//(GNf^) IS an integer programming problem of the form max cx s i Axr<b, 
X binary, where all data is nonnegative. Thus, integer programming heuristics 
seem appropriate for attacking this problem. Two basic heuristic techniques 
exist for solving such integer programs which w^c label "addition" heuristics and 
"deletion" heuristics. An addition heuristic begins with the feasible solution x=0 
and successively sets to 1 that variable which which myopically maximizes 
efTective profit. The efTeclive profit associated with is c^V (pj , uhere is a 
penalty whose definition varies bet^veen heuristics, but which in some w^ay 
reflects the units of feasibility used up by setting Xj to 1. The addition hcurdstic 
stops when when no additional variables can be set to 1 without violating feasibil- 
ity. A deletion heuristic begins with the usually infeasible solution 1 and suc- 
cessively sets to 0 that variable Xj w^hich myopically minimizes loss of efTective 
profit Here, pj is a penalty which reflects the amount of infeasibility 

currently being contributed by Xj-\. The deletion heuristic stops when w^hen a 
feasible solution is obtained. 

We have specialized two addition heuristics and two deletion heuristics to 
j^/(GNr). The addition heuristics begin with an empty GH]^ set and successively 
add row^s to the set until a maximal set is obtained. The dciclion h^airistics 
begin with an infeasible GNh set consisting of all the row's, and rows aim succes- 
sively deleted until a feasible set is obtained. Since a set obtained by dele- 

tion may not be maximal, a second phase, an addition phase, is appended to 
insure that the set is maximal. To further expand the GN;< set found, it is possi- 
ble to devise post-maximal techniques similar to the 2-opt, 3-opt and general h- 
opt procedures used in traveling salesman heuristics, c.g., [1G,17]. Application 
of such techniques was unwarranted, how^ever, since computational results in 
section 4 show that excellent approximate solutions wric obtained using t.he 
basic addition and deletion heuristics. 

The addition heuristics are described by Algorithm GNRa, with variations 
"Greedy" and "Toyoda." The effective profit associated v/ith adding row i to the 
GNr set is 1/ RPi where RPi is a row penalty derived from the current nonmaxi- 
mal solution, the nonzero elements in the row and feasibility requirements. 
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Thus, ai each step of the algorithm, the row with the smallest penalty is added 
to the GNr set. Feasibility is maintained by setting to infinity the row penalty of 
any row whose addition would cause an infeasibility. In the Greedy variation, 
HPi equals the number of nonzero elements in the row if the penalty is finite. 
The Toyoda variation is a modification of an integer programming heuristic 
developed by Toyoda [23]. In this heuristic, the finite row penalty RPi is based 
not only on the number of nonzero elements in the row, but also on how close to 
feasibility limits addition of the row would bring the current solution. 

The deletion heuristics are described by Algorithm GNRd, with variations 
"Dobson’’ and "Senju & Toyoda." In this algorithm, each row has a penalty KF\ 
which, roughly speaking, indicates how much infeasibility the row is contribut- 
ing. \/ RPj, is the loss in efTective profit if row i is removed from the GNy? set. 
Thus, this algorithm successively deletes rows with maximum penalty to minim- 
ize the loss of effective profit. 

Dobson [9] analyzes and gives worst-case performance guarantees for an 
addition heuristic for integer programs of the form min cx, s.t. Ax^b, 0<x<u, x 
integer, where all data is nonncgativc. By simple substitution of variables, how- 
ever, the Dobson heuristic may be interpreted as a deletion heuristic for prob- 
lems in the form of //(GNp). At each deletion step of this heuristic, RP^ is the 
number of nonzero elements in row i which are contributing to an infeasibility. 
If m^ is the optimal solution to A/(GNr) and is the heuristic solution obtained 
by deletion only, Dobson's worst-case bound on performance is 

d 

(m ~mj))/[m - mo) ^ X] where d is the maximum number of nonzero 

h-l 

elements in any row. This is the only performance guarantee knowTi for any of 
the heuristics implemented in this paper. Unfortunately, the upper bound on 
mo this yields is rather weak in practice (See Table 3.). Any addition heuristic 
may be used as a second phase for a deletion heuristic, but for the Dobson dele- 
tion heuristic, we chose the greedy addition heuristic as the second phase since 
the definition of RPi is consistent between the two phases. 

The second variant of GNRd is a specialization of the heuristic devised by 
Senju and Toyoda [22] which those authors label an "efTective gradient method." 
For A/(GN^), JI^ maps the set of feasible r values into the n-dimensional hyper- 
cube whose sides are of length 2. At every step of the algorithm, given current 
infeasible solution r, h^, where the jth element of (lI^r-2)^ is 

n 

mdx Jo, RP-^ may be interpreted as the length of the projection of 

t= 1 

the vector onto the shortest vector extending from the point Il^r outside of 
the hypcrcube to the boundary of the hypercube. The modified Toyoda addition 
heuristic is used as the second phase of this heuristic. 

Tlj^' lv\o algonlhms G\Ka and GNRd, with their voiiations, ar-e outlined as 
follows: 
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Algorithm GNRa 

Input: The LP coefficient matrix A. 

Output: A set of row indices 7 r corresponding to the largest GNr set found in 

A 

Comment: The basic algorithm is the "Greedy" addition heuristic. Tlic modified 
"Toyoda" heuristic is obtained by substituting the statement in 
square brackets for its predecessor. 

Step 0. "Initialization" 

Initialize: 

(a) 7 = 0 and 7' = 1 1,2 m 

(b) For each column a column bound 

CBj = 2 

Comment: CBj is the number of elements column j may con- 
tain. 

(c) For each 7e7', a row penalty 
Step 1. "Row^ Addition" 

Let RP - RF^ be the smallest row penalty (corresponding to rowsc7'). 
If RP < then 

(a) Move s from 7' to 7. 

(b) For each column j such that 0, 

(i) Let CB^ ^ CB^ - 1. 

(ii) If CBj = 0 then for each such that a^j A 0, let 

RF, = 

(ii) For each i^s such that ^ 0, if CBj-l then let 
RF^ = RP^ + L else let RP, = 

(c) Repeat Step 1. 

Step 2. "Termination" 

Print III - I and STOP. 

End of Algorithm GRRa 
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Algorithm GNRd 



Input: The LP coefficient matrix A. 

Output; A set of row indices /r corresponding to the largest GNr set found in 

A 

Comment: The basic algorithm is the "Dobson" heuristic. The "Senju and Toy- 
oda" heuristic is obtained by substituting the statements in square 
brackets for their predecessors. 



Step 0. "Initialization" 

Initialize: 

(a) f - [ 1.2, . . . , m { and /’ = 0. 

(b) For each column j , a column penalty 

CPj=i s 0-2. 

id 

Comment: CPj is the number of "excess" elements in column 
J- 

(c) For each le/, a row penalty 

m = E 1- 

CPy>0 

Comment: RPi is number of units of infeasibility which row i 
is currently contributing. 

(c) For each ic/, a row penalty 

Y. cPj. 

a 

CFj>0 

Comment: RP^^ is the sum of excess elements in columns with 
^ a nonzero entry in rowi. 

Step 1. "Row^eletion" 

Let RP ~ RPi be the largest row penalty (corresponding to row Ze/). 

If RP > 0 then 

(a) Move L from I to /'. 

(b) For each column j such that / 0 

(i) If CPj = 1 [If CPj > O] then, for each i/s such that 

/ 0. let hT, - R;\ - 1. 

(ii) Let CPj - CPj - 1. 

(c) Repeat Step 1. 

Step 2. "Rov/ Addition Penalties" 

For each i^P, compute a row penalty 



RP, 




if CPj < 0 for all a^j / 0 
otherwise. 



2 {CPj + 3) if CPj < 0 for all a^j ^ 0 



RP, - 



othenvise. 
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Step 3. "Row Addition" 

Let RP = RP^ be the smallest row penalty (corresponding to row seP). 

If RP < oo, then 

(a) Move s from /' to I. 

(b) For each j such that t 0. let CP^ = CP^ + 1. 

(c) Go to Step 2. 

Step 4. "Termination" 

Print /r = / and STOP. 

End of Aigontiim GNRd 

The execution times of the above algorithms and the other algorithms 
described in this paper are quite short if proper data structures are used. The 
initial computation of the row and column penalties can be made very quickly if 
the nonzero entries in each row and column are stored in a linked list. Column 
penalties are then updated in a single pass of a row. Because of sparsity, row 
penalties can usually be updated in passes through just a few columns. 
Efficiency is further improved if row and column partitions are maintained with 
an indirect address array which allows contiguous access. Associated with this 
mapping array, a second array expresses the inverse map to speed updating. 

An easily computable upper bound on A/(GNr), denoted is useful for 

checlcing the efficacy of the above algorithms. Algorithm UBR is designed for 
this purpose. Let Ai and As be a partition of the rows of A and let 2 , z i and zn be 
the solutions to i^f(GNR) on A A^ and As, respectively. If is any valid upper 
bound on ^/(GNr) for Ai. then 

2 < 2 j + 2s ~ ^^B X + 2s- 

Algorithm UBR iteratively applies the above statement, computing the simple 
bound VBx and letting A = As after each iteration. Tliis is repeated until all 
columns of As have at most two nonzero elements in them at which point 22 is 
equal to the number of rows in Aa. BBi^ is then given by the sum of the UB^ 
upper bounds found at each iteration plus Zn found at the last iteration. At each 
iteration, A is partitioned with respect to that column j having the maximum 
number of nonzero entries. Aj is all rows of A with ^ 0 and UBx - Zx~ 2 since 
column j has only nonzero elements in A^. 



Algorithm UBR 

Input: The LP coefficient matrix A 

Output: A value UB\i, an upper bound on | 7 r1 . 

Step 0. "Initialization" 

Initialize: 

(a) / = {1,2, . . . ,ml, and UB^ = 0. 

(b) For each column j , a column count 

ccj = ( E 1). 

ic/ 

Step 1. "Iterative Partitioning" 

Let CC = CQ be the largest column count {corresponding to column 1). 
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U CC >2 then 

(a) Let UBn ^ UB^ -h 2. 

(b) For each iel such that 0, 

(i) Delete 1 from /. 

(ii) For each j such that ^ 0, update column count letting 

cq = cq - 1 . 

(c) Repeat Step 1. 

Step 2. "Termination" 

Print rZ?R = UByi + |/| and STOP. 

End of Algorithm UBR 

Algorithm GNRC, the heuristic for -A/(GNr c)« is outlined next. Any one of the 
integer programming heuristics described for A/(GNr) could be applied to this 
problem. However, these algorithms will normally give only a single answer to 
the problem; our algorithm allows the exploration of a complete trajectory of 
maximal GNrq sets beginning with GNr and ending with GN^. Our algorithm 
begins with the set of rows /r found in Algorithm GNRa or GNRd and repeatedly 
altempts to expand this set by deleting columns, always saving the largest GNrc 
set found. This approach was suggested by manual analysis of several problems 
for which the GNr set is limited by a fevv^ key complicating columns. Deleting 
these columns produced a much larger embedded GNrq set, and motivated 
development of a new factorization LP code which effectively exploits GNrc 
structure [ 19]. 



Algorithm GNRC 

Input The LP coefficient matrix A and a GNr set /r, |/r| <m, e.g., 7/^ from 
Algorithms GNRa or GNRd. 

Output: A set of row indices Inc and a set of column indices Jn,c corresponding 

to the largest GNrc structure found in A 

Siep 0. "Initialization" 

Initialize: 

(a) 7 = 7 r. r = \l,2 in] - I, J = \1^2 nj, 7 r.c = 7. and 

Jr,c - J • 

Comment: 7 cind J are the current sets of row and column 
indices while 7 r c and Jr c store the best sets found. 

(b) For pprh column j^J , a column penally 

Y, 1 ) - 2 . 

tc7 

Comment: These column penalties remain as an artifact of 
Algorithm GNRd and can be defined as input. 

(c) For each i^I\ a row cost 

pCi = s 1- 

CP^=0 

Comment: 7?Q is the number of columns which must be 

deleted if row i is added to 7. 
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step 1. 



Step 2. 



"Column Deletion" 

Let RC = RCs be the smallest row cost (corresponding to rowse/'). 

(a) For each je/ such that (isj ^ 0, 

(i) Let CPj = CPj + 1. 

(ii) If CPj = 1 then delete j from J and for each ie/' such 
that Oij ^ 0, update row costs letting RCi = RC^ — 1. 

(b) Move s from /' to I . 

"Row-inclusion Penalties" 

For each ie/', compute a row penalty 



RPx 



^ {CPj -f l) if CPj < 0 for all o^j ^ 0 
°° olherwise 



Step 3. "Row Addition" 

Let RP = RPs be the smallest row penalty (corresponding to row se/'). 
If RP <0 then 

(a) Move s from /' to /. 

(b) For each j^J such that a^j 0 

(i) Let CPj ^ CPj + 1. 

(ii) If CPj = 0 then for each ie/' such that a^j ^ 0, update 
row costs letting RCi - RCi + 1. 

(c) Go to Step 2. 

Step 4. "Incumbent Test" 

If I / 1 + 1 1 / I > I /r.cI + I ‘^R.cl then let In c - ^ Jn,c ~ ^ • 

Step 5. "Termination" 

If I / I < m , then go to Step 1. Otherwise, print /r ^ R.c ^^d STOP. 

End of Algorithm GNRC 



A stronger test, allowing preemptive termination, is possible at Step 5: If 

I / 1 < m and m -t | / 1 > |/r,c 1 1‘^acl- However, the weaker test permits the 

exploration of a complete trajectory for GNr c discussed above. 

Along the lines of UBn, an easily computed upper bound on -^/(GNrc). 
denoted UBn,c> developed to check the accuracy of GNRC. Partition A as fol- 
lows: 



A = 



^11 I ^12 

4l I 



Let z, and 222 be the solutions to ^^(GNr) on A and A 22 . respectively, and 
let UB^i be any simple upper bound on A/(GNrc) ^ 11 * Then 



2 ^ Zh + 222^ + Z22. 



Algorithm UBRC computes UB^x by iteratively applying the above statement, 
computing the simple bound UBn and letting A= A 22 after each iteration. This 
is repeated until all columns of A 22 have at most two nonzero elements in them 
at which point 22 is equal to the number of rows plus the number of columns in 
Ag 2 - PPr.c is then given by the sum of the UBi upper bounds found at caeh 
iteration plus 222 found at the last iteration. If is selected such that it con- 
sists of single column and three rows, all vnth nonzero elements, then 
UBii-Zii = 3. Computational experience has indicated an effective rule for 
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selecting the partition: Among all columns in A having at least 3 nonzeros, select 
that column having the minimum number of nonzeros, and within that column 
select the first tiiree rows with nonzeros in them. If k partitions are carried out 
before Azz becomes a GN matrix, it follows that: 

= 3A: + I 7221 + 1^22 1 = 3A:+(|/|-3A:)+(| /l-A:) = |/| + | 

The last equality is used in computing UBy{,o. 



Algorithm UURC 



Input: The LP coefTicient matrix A 

Output: A value an upper bound on | /r.cI + I ^R.cl • 

Step 0. ’initialization” 

Initialize: 

(a) / = \ , mj. and UB^^c ~ |7 | + |c7 1. 

(b) For each column j , a column count 

CC^ = ( 1 ] 1 )• 

Oil 

l\/ 

Step 1. 'iterative Partitioning” 

Let CC = CQ be the smallest column count greater than 2 
(corresponding to column s). 

If no such column exists, go to step 2. Else, 

(a) Let UB^^ q = “ L 

(b) For exactly three ir/ such that / 0, 

(z) Delete z from /. 

(a) For each j such that a^j / 0, update column count letting 
CC) = CCj - 1. 

(c) Repeat Step 1. 

Step 2. ’’Termination” 

Print CB^q and STOP. 

End of Algorillim UBR 



A. Computation al Experience 

1 hf* algorithms dc'scribcd in sc'ction 3 have been iniplernented in FOKIiiiAK, 
using the X-System [5j as the host optimization package. Table 1 identifies 
twenty-three LP and mixed integer programming (MIP) problems which have 
been collected from various sources over the years. Some of these models are 
vf-ry well known, e.g., Dantzig's PILOT and the U.S. Department of Energy's P/iD 
and PIES, and most of them were sent to us because of their difTiculty, solution 
expense, or outright solution fadlure on commercial optimization systems. Table 
1 shows problem dimensions excluding right-hand sides and objective functions. 
Computation times displayed in Tables 2-4 are computc-seconds, accurate to 
the precision shown, for FORTRAN IV H (Extended) with 0ptimize(2), run on IBM 
3033AP under VM/CMS. 
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Table 1. LP/MIP Problem Set 


Problem 


Constraints 


Variables 


Nonzero Els. 


Model 


AIR 


170 


3,040 


6,023 


Physical Distribution 


ALUMINUM 


4,045 


6,805 


27,917 


Econometric Production k Distribution 


COAL 


170 


3,753 


7,506 


National Elnergy Planning 


CUBICl 


657 


3,074 


15,894 


Combinatorics Problem 


CUBIC2 


2,669 


11,905 


63,361 


Bigger Combinatorics Problem 


CUPS 


360 


618 


1,341 


Production Scheduling 


ELEC 


784 


2,800 


8,462 


Energy Production k Consumption 


FERT 


605 


9,024 


40,484 


Production k Distribution 


FOAM 


999 


4,020 


13,083 


Production Scheduling 


FOOD 


4,010 


14,409 


23.332 


Production, Distribution k Inventory Planmng 


GAS 


788 


5,541 


31,020 


Production Scheduling 


JCAP 


2,486 


3,619 


9,510 


Production k Slnpment Scheduling 


LANG 


1,235 


1,425 


22,028 


Equipment k Manpower Scheduling 


NETTING 


89 


190 


388 


International Currency Exchange 


ODSAS 


4,647 


4,995 


30,832 


Manpower PlanrLng 


PAD 


694 


3,297 


15,541 


Energy Allocation, Distribution k Consumption 


PAPER 


2,863 


5,318 


23,746 


Econometric National Production 


PIES 


662 


3,011 


13,376 


Energy Production k Consumption 


PILOT 


974 


2. 172 


12,927 


Energy Development Planmng 


REFINE 


5,220 


5,994 


40,207 


Oil Refinery Model 


STEEL 


831 


1,276 


9,608 


Eeonometrie Production k Distribution 


TRUCK 


220 


4,752 


30,074 


Fleet Dispateli (Set Cover) 


WADDING 


2,991 


15,001 


82,708 


Multieomunodity Prod, k D.stribution Pianrung 



Algorithms GNRa and GNRd were used to identify GN^ rows with Algorithm 
UBR used to give an upper bound on the total number of such rows. To check 
accuracy, we attempted, within budget limitations, to solve exactly the integer 
linear programs for in those cases where | /pj <UBn . (V/e were success- 

ful in all but one case, as seen. Times for solving the ILPs averaged 214.1 
seconds for those problems solved.) Results for GNRa and GNRd, given in Table 2, 
are (a) the size of the optimal GNp set found by the ILP, (b) the size of this set as 
a percentage of total problem rows m, (c) the size of the GNp set found by GNR, 
(d) the size of this set as a percentage of the ILP optimum, and (e) the time 
required by the algorithm. For GNRd, the column labeled ] /p| uses the notation 
a:b where a is |/p| and b is the number of row^s in /p which were gained in the 
addition phase of the heuristic. Problems are weighted equally in computing 
average percentages in the "Totals" row of the table. Times listed do not include 
input or output. 

All GNR variants perform quite well. The addition phase m GNRd did not 
often contribute a significant fraction of the GN rov/s found, but the additional 
rows found helped make both GNRd variants slightly better than either of the 
GNRa variants. The best algorithm on this problem set, GNRd (Senju Sc Toyoda), 
finds an average of 99.1% of the maximum GNp set on those problems which we 
can solve exactly. The GXp sets average 62.3% of the total problem rows on 
these same problems. GNR computation times are nominal compared with 
actual solution times of the seminal LPs and MIPs. 



Table 2 Results for Algorithms GNRa and CNRd 
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Note: NA indicates IP solution not available. (LP optimum is 85.) 
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Results for UBR, given in Table 3, include (a) the size of the optimal GNr set, 
(b) the upper bound, (c) the upper bound as a percentage of the ILP optimum, 
and (d) the time required to find the upper bound. For comparison, we include 
(e) Dobson's upper bound labeled and (f) that bound as a percentage of 

the ILP optimum. Table 3 also displays some properties of GNf^ as found by 
GNRd, Senju &: Toyoda. These properties include (g), the number of disjoint 
embedded GN components, (h) the largest and smallest components, (i) the 
number of null columns, and (j) the number of singleton columns. These proper- 
ties are of interest since the structure of the embedded generalized network 
afTecLs tiie solution lechnKjues used in an LP factorization. Foi* e..amplc\ eoiii- 
ponenls consisting of single rows may be handled most cffieienlly without utiliz- 
ing a complete generalized network code. 



Tobic 3 GN^R Features 



Prcblc-ni 


ILP Opl 


AJgcnlhn UBR 
Z Opt Time 


Cob:jon Bound 
Z Cpt 


Tclul 


Embedded Cl 

Larji-JA 

(m 1 n ) 


Compor.cnlG 

SnMic.-jt Null 
(m+n.) CcIj 


Sim 

Col'j 


AIR 


170 


170 


ICO 


.0 


170 


ICO 


1 


1701 5.040 




0 


57 


ALUMINUM 


2.193 


2,211 


ICO 7 


1 7 


3.793 


172 8 


1 15 


i.naio.ioi 


1 f 1 


0 


1.231 


CC.\L 


170 


170 


ICO 


0 


170 


ICO 


1 


17C1 3.753 




0 


0 


CUBIC 1 


312 


! 32-1 


103 8 


2 


505 


150 7 


30 


15Ci I.71G 


lf2 


121 


C12 


CUBIC2 


1.264 


1.332 


105 1 


2 7 


2479 


19G 1 


119 


5G'N G.C31 


If G 


353 


2.483 


CUPS 


333 


33G 


ICO 


0 


3. 3 


1 CG 0 


13 


tCi 1C2 


12 f 12 


72 


71 


ELEC 


520 


521 


ICO 9 


2 


705 


13^ 6 


11 


7H ICO 


2f IG 


18 


171 


PERT 


572 


572 


ICO 


1 


COO 


IC^ 9 


1 


5121 9.021 




0 


l.VwV 


FC/ai 


951 


C57 


ICO G 


C 


901 


1 0 r 2 


1 1 


31 If 1.321 


11 1 


H 


1.1-1 


FOOD 


3.71G 


3,720 


lOQ 1 


1 


3039 


ICG 0 


75 


1.7351 7.1 :? 


If 4 


ct,r^n 


C.9Cj 


G.\S 


73 


71 


lOl 1 


1 


Cr2 


931 2 


1 1 


5211,711 


1 t 2 


330 


5. CIO 


JCAP 


1.013 


1.C31 


101 0 


2 


216.2 


213 1 


130 


1 i C + 1 C 0 


1 « 2 


02 


1.33 5 


L\NG 


7M 


72G 


101 7 


1 


1122 


157 1 


3 


701 f 1 .235 


H 2 


189 


31 1 


NTrmxG 


72 


72 


ICO 


0 


0 1 


1 IG 7 


17 


23 f-dC 


2U 


23 


953 


1 ODSAS 


1,198 


1.510 


ICO 8 


2 3 


1101 


270 1 


115 


70lf2.lC3 


1 1 1 


537 


1.C52 


'P>SD 


122 


1 22 


ICO 


0 


553 


4 57 5 


3 


1.351 


P.f 33 


1.730 


1.179 


1 p:^PUR 


1.030 


1,553 


10 1 5 


4 


2730 


11^ 7 


402 


2-'"H.G3l 


1 f 1 


C75 


1.721 


1 PIES 


253 


255 


102 8 


.0 


571 


153 3 


35 


14C» 1 .015 


112 


C.CG 


720 


1 PILOT 


170 


ICO 


10! 3 


1 


807 


1 .5 7 


70 


1771533 


H 1 


G18 


g:i 


Riii ii iE 


3 . i o 


3.1 VO 


lOi L> 


8 


iVf u 


1 .... 3 


5 / i 


1.5..'* ,,,....0 


1 ^ 1 




l.\ . - 


STEEL 


131 


458 


ICC 3 


1 


7G3 


177 C 


95 


180 r041 


If 1 


24J 


Z 1 ■■ 


TRUCK 


NA 


1C5 


N.\ 


.1 


197 


N.\ 


p 


6913, CCS 


If la 


l.TCo 


2.315 


WADDING 


2,211 


2.222 


ICC 5 


.2 


2uCG 


120 C 


3 


9G0I I.ICO 


If 1 


4.111 


5. 032 j 



is surprisingly tight, averaging 101.4% of the true maximum, and com- 
putation times are nom'n.4. Dobson's bound is poor, averaging 203.1% of ihe 
true maximum. The GN components found usually consist of a few large com- 
ponents and numerous smcdl components. 
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Table 4 gives the results obtained by Algorithm GNRC and Algorithm UBRC. 
Since no ILP optimum is known for jI/{GNr c) in most cases, the items displayed 
differ from those items displayed in Tables 2 and 3. The results reported for 
Algorithm GNRC are (a) the size of the of the GNj^c structure found, (b) the time 
in seconds required to find the structure excluding input and output, (c) the size 
of the GNrc a percentage of the total constraint matrix, and (d) the percen- 
tage of of the LP upper bound {UBLP^q) achieved by the algorithm. The results 
reported for Algorithm UBRC are (e) |/r.cI + I‘^r.cI as a percentage of UBr^c* and 
(f) the time required to obtain UB^c- For comparison, the last two columns of 
the table give the total number of rows and columns obtained for the GNr and 
GNc problems. These are the sizes of the embedded GN submatrices when res- 
tricted to row submatriccs and column submatrices, respectively. Each prob- 
lem is weighted equally to compute average percentages in the “Totals" row. 



j Table 4. GKr^q Results 






Algorithm GNRC 




Mg. UBRC 


GNr 


GNc 


’ Problem 


KrcI + I<^rcI 


Time 


% (m +71 ) 


% UBLPji Q 


% 


Time 




\Jc\^m 


1 AIR 


3,210 


.0 


100 


100 


100 


.0 


3,210 


3,210 


, ALUMII'TUM 


9,027 


13.6 


83.2 


91.7 


91.7 


2.3 


8,900 


5,508 


! COAL 


3,923 


.0 


100 


100 


100 


.0 


3,923 


3,923 


i CUBIC 1 


3,365 


.6 


90.2 


99.4 


94.8 


.4 


3,365 


659 


' CUBIC2 


13,096 


11.1 


89.7 


99.5 


94.7 


6.4 


13,096 


2,690 


i CUPS 


951 


.0 


97.2 


100 


99.7 


.0 


951 


713 


ELEC 


3,322 


.3 


92.7 


99.0 


98.3 


.2 


3,320 


1,042 


FERT 


9,596 


.3 


99.7 


100 


99.9 


.2 


9,593 


2.362 


FOAM 


4,971 


.1 


99.0 


100 


99.7 


.1 


4,971 


1,044 


rOOI) 


18,137 


.8 


98.5 


99.5 


99.4 


.1 


18,125 


17,860 


GAS 


5,920 


5.4 


93.5 


94.9 


94.5 


.2 


5,614 


848 


. JCAP 


5,822 


5.5 


91.9 


07.7 


99.8 


.2 


4,851 


5,718 


i LANG 


2,139 


1.1 


80.4 


97.8 


90.2 


.2 


2,139 


1,905 


1 NETTING 


262 


.0 


93.9 


97.8 


100 


.2 


262 


256 


. 00? AS 


7,556 


40.0 


78.4 


78.0 


86.1 


1.2 


6,470 


5,094 


1 PAD 


3,621 


3.9 


00.7 


98.8 


95.3 


.3 


3,419 


2,416 


1 PAPER 


7,388 


4.6 


89.9 


95.9 


96.2 


.9 


7,179 


4,905 


• PA'S 


3,313 


.9 


90.2 


99.5 


91.8 


.2 


3,299 


2,241 


iP. o: 


2,645 


1.4 


84.1 


95.7 


91.6 


.2 


2,634 


1,567 




9,326 


19.3 


63.2 


93.8 


92.4 


2.3 


9,101 


7,729 




1,700 


.9 


60.7 


91.5 


69.7 


.2 


1.695 


1,131 


TV 


4,82? 


.5 


97,0 


NA 


90.3 


.3 


4.02? 


220 


1 W/uJDING 


17,209 


8.3 


95.6 


99.7 


97.8 


1.0 


17,209 


14,451 


1 ’ 


141,321 


118.1 


91.3% 


93.8% 


95.6 


17.1 


133,232 


87,582 



\o{,e: NA indicates not available. 



GNRC performs very well, also. The algorithm finds a GNrc structure whose 
s /(' avprages 91.3% of the size of the total constraint matrix. The size of the 
structure averages 96.8% of the LP upper bound on those problems for which the 
bound ivas obtained. (Times to obtain the LP bound averaged 315.6 seconds.) 
Vyith respect to UBi^c» the GNrc set found averages 95.6%. Thus, the upper 
bound provided by algorithm UBRC is only slightly weaker, on average, than the 
LP upper bound. In addition. UBRC has more than a 400 to 1 computational 
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speed advantage over the LP upper bound making it very attractive. 

Additional computational studies have been performed to investigate the 
structures which GNR and GNRC obtain. Figure 1 summarizes this work for 
ELEC, JCAP, PAD, PIES and PILOT. The outer rectangle represents, to scale, the 
constraint matrix for each problem. The area above the dashed line represents 
the GNr set found by GNRd, Senju &: Toyoda. Within this area are indicated the 
connected eomponents found by a simple eonneetivity algoritfirn. As indicated 
previously in Table 3, a few large components are typically found together with 
numerous small components. The area to the left of the vertical line represents 
the GNc set. The irregular lines trace the trajeetoiics of the GNrc structures 
found by GNRC, ranging from GNr on the right to GNc sit the lower left. From any 
point on this trajectory, all rows and columns above and to the left form a GN 
set. The circle indicates the largest GNrc structure found on this trajectory. 

5. Conclusion 

Although GNc identification is easy, GNr and GNr c identification is theoreti- 
cally difficult. Hov/ever, maximal, and often optimal GNr and GNrc substruc- 
tures can be found in an LP constraint matrix using the heuristic algorithms 
developed here. In some problems, large GNr structures can be found, while in 
other problems, it is necessary to remove some columns to find a large embed- 
ded GNrc structure. Since execution time is modest for heuristic GN 
identification, our algorithms can be applied as a matter of course in general 
LPs to seek GN substructures. Evidence from the problem set indicates that 
this is well-advised if a GN-exploiting method is available: No members of the 
problem set were known, a priori, to contain significant GN structure and yet, in 
several cases, GN structure was predominant. 
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V'lgure 1 . Embedded Generalized Networks 
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Figure 1. (continued) 
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